Enhancing Information Extraction with Context and Inference: The ODIX Platform

نویسندگان

  • Hisham Assal
  • Franz Kurfess
  • Kym Pohl
  • John Seng
چکیده

Natural Language Processing (NLP) provides tools to extract explicitly stated information from text documents. These tools include Named Entity Recognition (NER) and Parts-Of-Speech (POS). The extracted information represents discrete entities in the text and some relationships that may exist among them. To perform intelligent analysis on the extracted information a context has to exist in which this information is placed. The context provides an environment to link information that is extracted from multiple documents and offers a big picture of the domain. Analysis can then be provided by adding inference capabilities to the environment. The ODIX platform provides an environment for bringing together information extraction, ontology, and intelligent analysis. The platform design relies on existing NLP tools to provide the information extraction capabilities. It also utilizes a Web crawler to collect text documents from the Web. The context is provided by a domain ontology that is loaded at run time. The ontology offers limited inference capabilities and external intelligent agents offer more advanced reasoning capabilities. User involvement is key to the success of the analysis process. At every step of the process, the user has the opportunity to direct the system, set selection criteria, correct errors, or add additional information. John Seng California Polytechnic State University, USA

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Attaques d'inférence sur des bases de données géolocalisées. (Inference attacks on geolocated data)

In recent years, we have observed the development of connected and nomad devices such as smartphones, tablets or even laptops allowing individuals to use location-based services (LBSs), which personalize the service they offer according to the positions of users, on a daily basis. Nonetheless, LBSs raise serious privacy issues, which are often not perceived by the end users. In this thesis, we ...

متن کامل

Discovering the Underlying Components Affecting the Usability of IoT in Iranian Libraries: A Theory Based on Context

Objective: The aim is to discover the underlying context components of IOT usability in Iranian libraries: A qualitative approach consistent with grounded theory. Method: This qualitative study was conducted based on grounded theory. Data were collected through semi-structured interviews with 13 faculty members of knowledge and information science based on purposeful and chain methods. Responsi...

متن کامل

Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner

Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common computational platform. Mobile Computing is a Generic word that introduces using of movable, handheld devices with wireless communication, for processing data. Mobile Computing focused on providing access to data, information, services and communications anywhere an...

متن کامل

Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner

Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common computational platform. Mobile Computing is a Generic word that introduces using of movable, handheld devices with wireless communication, for processing data. Mobile Computing focused on providing access to data, information, services and communications anywhere an...

متن کامل

Information Extraction using Context-free Grammatical Inference from Positive Examples

Information extraction from textual data has various applications, such as semantic search. Learning from positive example have theoretical limitations, for many useful applications (including natural languages), substantial part of practical structure (CFG) can be captured by framework introduced in this paper. Our approach to automate identification of structural information is based on gramm...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017